Spatial entries in GIS can be represented as a vector or as a raster
You see below how river can be represented as a vector (left) and as a raster (right)
Vector layers are sets of geometries associated with non-spatial attributes
The geomteries are sequences of one or more coordinates, connected to form lines or polygons
The non-spatial attributes come as a table
Vector layers can be:
Points are just dots that have a coordinate pair:
(i.e. latitude - X and longitudes - Y)
Here are for example all the restaurants in central Rome
Polylines are a sequence of two or more coordinate pairs - vertices.
Roads and rivers are typically stored as polylines in GIS
Here are for example all the roads in central Rome
Polygons are three or more line segments whose starting and ending coordinate pairs are the same.
Countries are usually depicted as polygons
Here are for example all the buildings in central Rome
Here are for example all countries in the world
You can download shapefiles for different countries by going to https://gadm.org
You can download shapefiles for different countries by going to https://gadm.org
You can download shapefiles for different countries by going to https://gadm.org
You can download shapefiles for different countries by going to https://gadm.org
You can download shapefiles for different countries by going to https://gadm.org
You can download shapefiles for different countries by going to https://gadm.org
You can download shapefiles for different countries by going to https://gadm.org
You can download shapefiles for different countries by going to https://gadm.org
We can do the same for Italy
The most common vector files are:
| Type | Format | File extension |
|---|---|---|
| Binary | ESRI Shapefile | .shp, .shx, .dbf, .prj |
| GeoPackage (GPKG) | .gpkg |
|
| Plain Text | GeoJSON | .json or .geojson |
| GPS Exchange Format (GPX) | .gpx |
|
| Keyhole Markup Language (KML) | .gpx |
|
| Spatial Databases | PostgreSQL / PostGIS |
sp packageThe first R package to establish a uniform vector layer class system was sp (2005-2023).
Together with rgdal (2003-2023) and rgeos (2011-2023), the sp package dominated the landscape of spatial analysis in R
sp| Class | Geometry type | Attributes |
|---|---|---|
SpatialPoints |
Points | - |
SpatialPointsDataFrame |
Points | data.frame |
SpatialLines |
Lines | - |
SpatialLinesDataFrame |
Lines | data.frame |
SpatialPolygons |
Polygons | - |
SpatialPolygonsDataFrame |
Polygons | data.frame |
We will avoid the use of sp, rgdal, and rgeos and focus on the sf package.
The rgdal package is deprecated starting with 2023.
You should avoid using those packages (i.e. you will get errors and maybe conflicts with other libraries).
However, it is important to be aware of these packages, given how many forums, articles, and dependencies have been using them.
sf packagesf is the standard package for working with vector data in R.
It relies on external software components: GDAL, GEOS, and PROJ.
It contains three classes:
sfg — a single geometrysfc - a geometry column: a set of sfg geometries with CRS informationsf - a layer which is an sfc column inside a data frame with non-spatial attributes.sf classHere is a polygon layer with three features and six non-spatial attributes: BIR74, SID74, NWBIR74, BIR79, SID79, NWBIR79
sf classMapping the layer
sf class
sfg is a single geometry which can be of multiple types
st_pointst_multipointst_linestringst_multilinestringst_polygonst_multipolygonst_geometrycollectionThese are the seven most common Simple Feature geometry types
These are the seven most common Simple Feature geometry types
These are the seven most common Simple Feature geometry types
These are the seven most common Simple Feature geometry types
These are the seven most common Simple Feature geometry types
These are the seven most common Simple Feature geometry types
These are the seven most common Simple Feature geometry types
The last type GEOMETRYCOLLECTION is rarely used
The Shapefile format does not support geometries of type GEOMETRYCOLLECTION
Let us create one point:
If we print the point, we get its WKT (Well-Known Text) representation
If we examine the class definition, we get:
XY is the two-dimensional geometry of the pointPOINT is the geometry type (it could be POINT, MULTIPOINT, etc.)sfg is the general class (i.e. simple feature geometry)This is what the point looks like:
This is how we create a polygon
If we print the polygon, we get its WKT representation
If we examine the class definition, we get:
This is what that polygon looks like:
We can also make a more complicated polygon:
POLYGON ((0 1, 1 0, 2 0.5, 3 0, 4 0, 5 0.5, 6 -0.5, 7 -0.5, 7 1, 0 1))
This is what the class looks like:
This is what that polygon looks like:
We can combine geomteries using the c function
This will result into a single MULTIPOLYGON geometry, composed of all the shapes in the input
MULTIPOLYGON (((0 0, 0 -1, 7.5 -1, 7.5 0, 0 0)), ((0 1, 1 0, 2 0.5, 3 0, 4 0, 5 0.5, 6 -0.5, 7 -0.5, 7 1, 0 1)))
If we check the class of ab, we see that it is a MULTIPOLYGON
This is what the combined polygon looks like:
A new geometry can be calculated by applying various functions to existing ones.
GEOMETRYCOLLECTION (POLYGON ((7 0, 7 -0.5, 6 -0.5, 5.5 0, 7 0)), LINESTRING (4 0, 3 0), POINT (1 0))
Notice the difference from the combined geometries: ab
MULTIPOLYGON (((0 0, 0 -1, 7.5 -1, 7.5 0, 0 0)), ((0 1, 1 0, 2 0.5, 3 0, 4 0, 5 0.5, 6 -0.5, 7 -0.5, 7 1, 0 1)))
Also, notice the difference between classes:
This is what the intersected geometries look like:
This is how they differ from the combined geometries:
sfcLet us create two more points: pt2 and pt3.
Geometry objects can be collected into a geometry column - sfc, which is done with st_sfc
A geometry column also contains a Coordinate Reference System
sfcFour types of CRS definitions are acceptable
4326)crs object, as returned by the st_crs function'+proj=longlat +datum=WGS84 +no_defs')If the crs argument is omitted, the default is to set an “undefined” CRS (i.e., NA).
It is ideal to use EPSG codes.
EPSG Geodetic Parameter Dataset is a public registry of geodetic datums, spatial reference systems, Earth ellipsoids, coordinate transformations and related units of measurement.
It came from a member of the European Petroleum Survey Group in 1985.
Different countries have different codes.
For example, the standard for the whole world is 4326.
The code for Italy is 6875.
This is from https://epsg.io
EPSG Geodetic Parameter Dataset is a public registry of geodetic datums, spatial reference systems.
sfcWe can combine pt1, pt2, and pt3.
Reminder: this is how we defined them:
This is how we combine them:
Geometry set for 3 features
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 34.79844 ymin: 31.06862 xmax: 35.01164 ymax: 31.26028
Geodetic CRS: WGS 84
POINT (34.79844 31.24329)
POINT (34.81283 31.26028)
POINT (35.01163 31.06862)
Note the we added 4326 - the code for the whole world.
sfcThe combined points look like below
sfWe can add attributes to the geometry in the following way:
sfWe can combine the attribute table name_df and the geometry column geom (sfc) using st_sf, resulting in a layer
Simple feature collection with 3 features and 1 field
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 34.79844 ymin: 31.06862 xmax: 35.01164 ymax: 31.26028
Geodetic CRS: WGS 84
name combined
1 Point 1 POINT (34.79844 31.24329)
2 Point 2 POINT (34.81283 31.26028)
3 Point 3 POINT (35.01163 31.06862)
sf layer with plotsf layer with ggplotsf layer with ggplotAs you might imagine, we can also plot the polygons that we made previously with ggplot
Here is what ab looks like:
Sometimes, we are interested in going the other way around and extract:
The following command allows us to extract the geometry
The following command allows us to drop the geometry
We are then left just with the attribute table:
The coordinates of sf, sfc or sfg objects can be obtained with the function st_coordinates.
The coordinates are a matrix:
We can turn the matrix into a dataframe in the following way
We can create point layers from tables.
Here is an example below:
We can create point layers from tables.
Here is an example below:
name addr.street lat lon
1 Pizzeria ai Marmi Viale di Trastevere 41.88826 12.47379
2 Sichuan Haozi Via di San Martino ai Monti 41.89580 12.49948
3 Dar filettaro a Santa Barbara Largo dei Librari 41.89467 12.47370
We can create point layers from tables.
Here is an example below:
name addr.street lat lon
1 Pizzeria ai Marmi Viale di Trastevere 41.88826 12.47379
2 Sichuan Haozi Via di San Martino ai Monti 41.89580 12.49948
3 Dar filettaro a Santa Barbara Largo dei Librari 41.89467 12.47370
The restaurants dataframe can be converted to a sf layer using st_as_sf:
We can create point layers from tables.
Here is an example below:
name addr.street lat lon
1 Pizzeria ai Marmi Viale di Trastevere 41.88826 12.47379
2 Sichuan Haozi Via di San Martino ai Monti 41.89580 12.49948
3 Dar filettaro a Santa Barbara Largo dei Librari 41.89467 12.47370
The restaurants dataframe can be converted to a sf layer using st_as_sf:
We can create point layers from tables.
Here is an example below:
name addr.street lat lon
1 Pizzeria ai Marmi Viale di Trastevere 41.88826 12.47379
2 Sichuan Haozi Via di San Martino ai Monti 41.89580 12.49948
3 Dar filettaro a Santa Barbara Largo dei Librari 41.89467 12.47370
The restaurants dataframe can be converted to a sf layer using st_as_sf:
Simple feature collection with 3 features and 2 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 12.4737 ymin: 41.88826 xmax: 12.49948 ymax: 41.8958
Geodetic CRS: WGS 84
name addr.street
1 Pizzeria ai Marmi Viale di Trastevere
2 Sichuan Haozi Via di San Martino ai Monti
3 Dar filettaro a Santa Barbara Largo dei Librari
geometry
1 POINT (12.47379 41.88826)
2 POINT (12.49948 41.8958)
3 POINT (12.4737 41.89467)
This is what these points look like, if we map them out:
Without some context, it is difficult to see what those points
You can see below an updated map with the streets in the background.
sf properties: rows and columnsSome important properties of sf objects include rows and columns.
Number of rows:
Number of columns:
We can obtain both rows and column by typing:
sf properties: bounding box coordinatesThe st_bbox function returns the bounding box coordinates
sf properties: crsThe st_crs function returns the Coordinate Reference System (CRS)
Coordinate Reference System:
User input: EPSG:4326
wkt:
GEOGCRS["WGS 84",
ENSEMBLE["World Geodetic System 1984 ensemble",
MEMBER["World Geodetic System 1984 (Transit)"],
MEMBER["World Geodetic System 1984 (G730)"],
MEMBER["World Geodetic System 1984 (G873)"],
MEMBER["World Geodetic System 1984 (G1150)"],
MEMBER["World Geodetic System 1984 (G1674)"],
MEMBER["World Geodetic System 1984 (G1762)"],
MEMBER["World Geodetic System 1984 (G2139)"],
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]],
ENSEMBLEACCURACY[2.0]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
CS[ellipsoidal,2],
AXIS["geodetic latitude (Lat)",north,
ORDER[1],
ANGLEUNIT["degree",0.0174532925199433]],
AXIS["geodetic longitude (Lon)",east,
ORDER[2],
ANGLEUNIT["degree",0.0174532925199433]],
USAGE[
SCOPE["Horizontal component of 3D system."],
AREA["World."],
BBOX[-90,-180,90,180]],
ID["EPSG",4326]]
Selecting only the restaurants that have an address:
Selecting only the restaurants that have an address:
Selecting only the restaurants that have an address:
Here is how the two dataframes compare:
Subsetting columns is very similar to subsetting a dataframe.
The difference is the geometry column that stays.
We turn the spatial dataframe into a regular dataframe.
This is how the two compare:
Simple feature collection with 3 features and 2 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 12.4737 ymin: 41.88826 xmax: 12.49948 ymax: 41.8958
Geodetic CRS: WGS 84
name addr.street
1 Pizzeria ai Marmi Viale di Trastevere
2 Sichuan Haozi Via di San Martino ai Monti
3 Dar filettaro a Santa Barbara Largo dei Librari
geometry
1 POINT (12.47379 41.88826)
2 POINT (12.49948 41.8958)
3 POINT (12.4737 41.89467)
As discussed, the most common vector files are:
| Type | Format | File extension |
|---|---|---|
| Binary | ESRI Shapefile | .shp, .shx, .dbf, .prj |
| GeoPackage (GPKG) | .gpkg |
|
| Plain Text | GeoJSON | .json or .geojson |
| GPS Exchange Format (GPX) | .gpx |
|
| Keyhole Markup Language (KML) | .gpx |
|
| Spatial Databases | PostgreSQL / PostGIS |
Here is how we would would read all these different files:
Reading a shape file
Reading a geojson file
Reading a from a geodatabase
Popescu (JCU): Lecture 10